Tong Xu

Mock Worlds, Real Skills: Building Small Agentic Language Models with Synthetic Tasks, Simulated Environments, and Rubric-Based Rewards

Jan 30, 2026
Via arXiv

Token-level Collaborative Alignment for LLM-based Generative Recommendation

Jan 26, 2026
Via arXiv

From Tags to Trees: Structuring Fine-Grained Knowledge for Controllable Data Selection in LLM Instruction Tuning

Jan 20, 2026
Via arXiv

DynaDebate: Breaking Homogeneity in Multi-Agent Debate with Dynamic Path Generation

Jan 09, 2026
Via arXiv

VIGIL: Defending LLM Agents Against Tool Stream Injection via Verify-Before-Commit

Jan 09, 2026
Via arXiv

Look As You Think: Unifying Reasoning and Visual Evidence Attribution for Verifiable Document RAG via Reinforcement Learning

Nov 15, 2025
Via arXiv

TeaRAG: A Token-Efficient Agentic Retrieval-Augmented Generation Framework

Nov 07, 2025
Via arXiv

A2R: An Asymmetric Two-Stage Reasoning Framework for Parallel Reasoning

Sep 26, 2025
Via arXiv

Verti-Arena: A Controllable and Standardized Indoor Testbed for Multi-Terrain Off-Road Autonomy

Aug 11, 2025
Via arXiv

Xiangqi-R1: Enhancing Spatial Strategic Reasoning in LLMs for Chinese Chess via Reinforcement Learning

Jul 16, 2025
Via arXiv